{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qsGKhkub2LTf"
      },
      "source": [
        "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/docs/onboarding-recommender.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/docs/onboarding-recommender.ipynb)\n",
        "\n",
        "# Recommendation Systems\n",
        "\n",
        "Recommendation systems have revolutionized the way we discover and explore new things. These intelligent systems utilize sophisticated algorithms and data analysis to understand individual preferences and provide personalized recommendations.\n",
        "<br><br>\n",
        "By analyzing user data, such as **browsing history**, **purchase patterns**, and **social interactions**, recommendation systems can effectively predict and suggest items that align with users' interests. Whether it's suggesting a new movie to watch, a book to read, or a product to buy, these systems streamline decision-making and enhance the overall user experience.\n",
        "<br>\n",
        "With their ability to uncover hidden gems and introduce users to exciting possibilities, recommendation systems have become invaluable tools in navigating the overwhelming abundance of choices in today's digital landscape.\n",
        "\n",
        "### Recommendation Systems and Vector Databases\n",
        "\n",
        "In recommendation systems, understanding the similarity between users and items is crucial for generating accurate and personalized recommendations. By leveraging vector databases, these systems can store and organize user and item vectors, which capture the essential characteristics and preferences associated with each user and item.\n",
        "<br><br>\n",
        "The vector database employs <a href=\"https://www.pinecone.io/learn/vector-database/#:~:text=a%20vector%20database.-,Algorithms,-Several%20algorithms%20can\">advanced indexing techniques</a>, to enable fast retrieval of **similar users or items** based on their **vector representations**. This enables recommendation systems to efficiently process *large-scale datasets* and identify meaningful connections, leading to more precise and relevant recommendations.\n",
        "<br><br>\n",
        "By harnessing the power of vector databases, such as **Pinecone**, recommendation systems can optimize their performance, enhance user satisfaction, and deliver tailored experiences that align with individual preferences.\n",
        "<br><br><br>\n",
        "Let's take a look at how we can implement one of those use cases!"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mxuvFMzo2LTg"
      },
      "source": [
        "We start by installing all necessary libraries."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 1,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "rvEdmJGF2LTh",
        "outputId": "45fe7415-9dee-4597-b368-8ec171426476"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m177.2/177.2 kB\u001b[0m \u001b[31m3.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m7.2/7.2 MB\u001b[0m \u001b[31m57.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m588.3/588.3 MB\u001b[0m \u001b[31m2.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m60.0/60.0 kB\u001b[0m \u001b[31m5.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m300.4/300.4 kB\u001b[0m \u001b[31m24.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m1.3/1.3 MB\u001b[0m \u001b[31m61.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m1.1/1.1 MB\u001b[0m \u001b[31m46.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m12.3/12.3 MB\u001b[0m \u001b[31m62.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m16.4/16.4 MB\u001b[0m \u001b[31m51.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m34.9/34.9 MB\u001b[0m \u001b[31m15.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m3.1/3.1 MB\u001b[0m \u001b[31m84.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m268.8/268.8 kB\u001b[0m \u001b[31m24.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m7.8/7.8 MB\u001b[0m \u001b[31m96.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m1.3/1.3 MB\u001b[0m \u001b[31m70.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m1.7/1.7 MB\u001b[0m \u001b[31m72.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m6.0/6.0 MB\u001b[0m \u001b[31m100.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m439.2/439.2 kB\u001b[0m \u001b[31m24.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m224.5/224.5 kB\u001b[0m \u001b[31m22.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m223.6/223.6 kB\u001b[0m \u001b[31m20.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m223.0/223.0 kB\u001b[0m \u001b[31m21.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m218.0/218.0 kB\u001b[0m \u001b[31m20.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m218.0/218.0 kB\u001b[0m \u001b[31m21.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m211.7/211.7 kB\u001b[0m \u001b[31m19.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m341.8/341.8 kB\u001b[0m \u001b[31m32.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m73.4/73.4 kB\u001b[0m \u001b[31m8.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m4.9/4.9 MB\u001b[0m \u001b[31m92.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m781.3/781.3 kB\u001b[0m \u001b[31m56.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m11.1/11.1 MB\u001b[0m \u001b[31m51.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m143.1/143.1 kB\u001b[0m \u001b[31m13.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m120.3/120.3 kB\u001b[0m \u001b[31m12.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m115.6/115.6 kB\u001b[0m \u001b[31m11.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m115.5/115.5 kB\u001b[0m \u001b[31m11.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m115.3/115.3 kB\u001b[0m \u001b[31m10.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m115.1/115.1 kB\u001b[0m \u001b[31m11.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m114.6/114.6 kB\u001b[0m \u001b[31m12.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n",
            "google-cloud-bigquery 3.10.0 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.19.3 which is incompatible.\n",
            "google-cloud-bigquery-connection 1.12.1 requires google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.0, but you have google-api-core 2.8.2 which is incompatible.\n",
            "google-cloud-bigquery-connection 1.12.1 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.19.3 which is incompatible.\n",
            "google-cloud-bigquery-storage 2.22.0 requires google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.0, but you have google-api-core 2.8.2 which is incompatible.\n",
            "google-cloud-bigquery-storage 2.22.0 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.19.3 which is incompatible.\n",
            "google-cloud-datastore 2.15.2 requires google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.0, but you have google-api-core 2.8.2 which is incompatible.\n",
            "google-cloud-datastore 2.15.2 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.19.3 which is incompatible.\n",
            "google-cloud-firestore 2.11.1 requires google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.0, but you have google-api-core 2.8.2 which is incompatible.\n",
            "google-cloud-firestore 2.11.1 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.19.3 which is incompatible.\n",
            "google-cloud-functions 1.13.2 requires google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.0, but you have google-api-core 2.8.2 which is incompatible.\n",
            "google-cloud-functions 1.13.2 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.19.3 which is incompatible.\n",
            "google-cloud-language 2.9.1 requires google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.0, but you have google-api-core 2.8.2 which is incompatible.\n",
            "google-cloud-language 2.9.1 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.19.3 which is incompatible.\n",
            "google-cloud-translate 3.11.3 requires google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.0, but you have google-api-core 2.8.2 which is incompatible.\n",
            "google-cloud-translate 3.11.3 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.19.3 which is incompatible.\n",
            "google-colab 1.0.0 requires pandas==1.5.3, but you have pandas 2.0.3 which is incompatible.\n",
            "grpc-google-iam-v1 0.12.6 requires protobuf!=3.20.0,!=3.20.1,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.19.3 which is incompatible.\n",
            "pandas-gbq 0.17.9 requires pyarrow<10.0dev,>=3.0.0, but you have pyarrow 11.0.0 which is incompatible.\n",
            "tensorflow-datasets 4.9.2 requires protobuf>=3.20, but you have protobuf 3.19.3 which is incompatible.\n",
            "tensorflow-hub 0.14.0 requires protobuf>=3.19.6, but you have protobuf 3.19.3 which is incompatible.\n",
            "tensorflow-metadata 1.14.0 requires protobuf<4.21,>=3.20.3, but you have protobuf 3.19.3 which is incompatible.\u001b[0m\u001b[31m\n",
            "\u001b[0m"
          ]
        }
      ],
      "source": [
        "!pip install -qU \\\n",
        "    pinecone-client==3.1.0 \\\n",
        "    pinecone-datasets==0.7.0 \\\n",
        "    transformers==4.30.2 \\\n",
        "    tensorflow==2.11.1 \\\n",
        "    pinecone-notebooks==0.1.1"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bo-qDSBZ2LTh"
      },
      "source": [
        "---\n",
        "\n",
        "\ud83d\udea8 _Note: the above `pip install` is formatted for Jupyter notebooks. If running elsewhere you may need to drop the `!`._\n",
        "\n",
        "---"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fBowTnsb2LTh"
      },
      "source": [
        "## Data Preparation\n",
        "\n",
        "<img alt=\"Onboarding recommender diagram\" src=\"https://raw.githubusercontent.com/pinecone-io/examples/master/docs/assets/onboarding_recommender_data_flow.jpg\"  width=\"70%\">"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JXFYLvCz2LTi"
      },
      "source": [
        "#### Downloading the Dataset\n",
        "\n",
        "We will download a pre-embedding dataset from `pinecone-datasets`. Allowing us to skip the embedding and any other preprocessing steps.\n",
        "<br><br>\n",
        "When working with your own dataset you will need to perform this embedding step but we have prebuilt the embeddings so we can jump right to the action."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 206
        },
        "id": "DNBAE4zf2LTi",
        "outputId": "59a8b688-a6b4-4491-d782-6f75e5fd0a85"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "\n",
              "\n",
              "  <div id=\"df-7c134bec-5265-4fea-8977-a20d4a8581da\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>values</th>\n",
              "      <th>sparse_values</th>\n",
              "      <th>metadata</th>\n",
              "      <th>blob</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>tt5027774</td>\n",
              "      <td>[-0.12388430535793304, 0.23021861910820007, -0...</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>{'imdb_id': 'tt5027774', 'movie_id': 6705, 'po...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>tt5463162</td>\n",
              "      <td>[0.008479624055325985, 0.3665461540222168, -0....</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>{'imdb_id': 'tt5463162', 'movie_id': 7966, 'po...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>tt4007502</td>\n",
              "      <td>[-0.0022702165879309177, 0.5886886715888977, -...</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>{'imdb_id': 'tt4007502', 'movie_id': 1614, 'po...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>tt4209788</td>\n",
              "      <td>[0.08350061625242233, 0.4322584867477417, -0.2...</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>{'imdb_id': 'tt4209788', 'movie_id': 7022, 'po...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>tt2948356</td>\n",
              "      <td>[-0.1614755392074585, 0.41389355063438416, -0....</td>\n",
              "      <td>None</td>\n",
              "      <td>None</td>\n",
              "      <td>{'imdb_id': 'tt2948356', 'movie_id': 3571, 'po...</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-7c134bec-5265-4fea-8977-a20d4a8581da')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "\n",
              "\n",
              "\n",
              "    <div id=\"df-2ee2b259-723b-4cf1-8747-50478b57956a\">\n",
              "      <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-2ee2b259-723b-4cf1-8747-50478b57956a')\"\n",
              "              title=\"Suggest charts.\"\n",
              "              style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "      </button>\n",
              "    </div>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "    background-color: #E8F0FE;\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: #1967D2;\n",
              "    height: 32px;\n",
              "    padding: 0 0 0 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: #E2EBFA;\n",
              "    box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: #174EA6;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "    background-color: #3B4455;\n",
              "    fill: #D2E3FC;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart:hover {\n",
              "    background-color: #434B5C;\n",
              "    box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "    filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "    fill: #FFFFFF;\n",
              "  }\n",
              "</style>\n",
              "\n",
              "    <script>\n",
              "      async function quickchart(key) {\n",
              "        const containerElement = document.querySelector('#' + key);\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      }\n",
              "    </script>\n",
              "\n",
              "      <script>\n",
              "\n",
              "function displayQuickchartButton(domScope) {\n",
              "  let quickchartButtonEl =\n",
              "    domScope.querySelector('#df-2ee2b259-723b-4cf1-8747-50478b57956a button.colab-df-quickchart');\n",
              "  quickchartButtonEl.style.display =\n",
              "    google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "}\n",
              "\n",
              "        displayQuickchartButton(document);\n",
              "      </script>\n",
              "      <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-7c134bec-5265-4fea-8977-a20d4a8581da button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-7c134bec-5265-4fea-8977-a20d4a8581da');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n"
            ],
            "text/plain": [
              "          id                                             values sparse_values  \\\n",
              "0  tt5027774  [-0.12388430535793304, 0.23021861910820007, -0...          None   \n",
              "1  tt5463162  [0.008479624055325985, 0.3665461540222168, -0....          None   \n",
              "2  tt4007502  [-0.0022702165879309177, 0.5886886715888977, -...          None   \n",
              "3  tt4209788  [0.08350061625242233, 0.4322584867477417, -0.2...          None   \n",
              "4  tt2948356  [-0.1614755392074585, 0.41389355063438416, -0....          None   \n",
              "\n",
              "  metadata                                               blob  \n",
              "0     None  {'imdb_id': 'tt5027774', 'movie_id': 6705, 'po...  \n",
              "1     None  {'imdb_id': 'tt5463162', 'movie_id': 7966, 'po...  \n",
              "2     None  {'imdb_id': 'tt4007502', 'movie_id': 1614, 'po...  \n",
              "3     None  {'imdb_id': 'tt4209788', 'movie_id': 7022, 'po...  \n",
              "4     None  {'imdb_id': 'tt2948356', 'movie_id': 3571, 'po...  "
            ]
          },
          "execution_count": 2,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "from pinecone_datasets import load_dataset\n",
        "\n",
        "dataset_name = \"movielens-user-ratings\"\n",
        "dataset = load_dataset(dataset_name)\n",
        "dataset.head()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "I3rEGmwR2LTi",
        "outputId": "15c0036f-db4f-45b9-f751-4d43e014be90"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "970582"
            ]
          },
          "execution_count": 3,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "len(dataset)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "RPp6NQRhxXWo"
      },
      "source": [
        "We can limit the number of records within our `dataset` if on the Standard Tier of Pinecone (for paid users, you can index the full dataset)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "ovQo1JIGxXxr",
        "outputId": "7fd396ed-9d75-4472-817c-fd774c59c06d"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "10000"
            ]
          },
          "execution_count": 4,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "dataset = dataset.head(10_000)\n",
        "len(dataset)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NwpTG_eJ2LTi"
      },
      "source": [
        "#### Reformatting the Dataset\n",
        "\n",
        "A `pinecone-dataset` always contains `id`, `values`, `sparse_values`, `metadata`, and `blob`. All we need are the IDs, vector embeddings (stored in `values`), and some metadata (which is actually stored in `blob`). Let's reformat the dataset ready for adding to Pinecone. We also drop `sparse_values` as they are not needed for this example.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 206
        },
        "id": "FSFtJSfL2LTi",
        "outputId": "b11f041f-7246-421d-a401-7b627d3859f4"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "\n",
              "\n",
              "  <div id=\"df-e776b9fa-a05a-40a7-b828-495c9c5f06b5\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>values</th>\n",
              "      <th>metadata</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>tt5027774</td>\n",
              "      <td>[-0.12388430535793304, 0.23021861910820007, -0...</td>\n",
              "      <td>{'imdb_id': 'tt5027774', 'movie_id': 6705, 'po...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>tt5463162</td>\n",
              "      <td>[0.008479624055325985, 0.3665461540222168, -0....</td>\n",
              "      <td>{'imdb_id': 'tt5463162', 'movie_id': 7966, 'po...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>tt4007502</td>\n",
              "      <td>[-0.0022702165879309177, 0.5886886715888977, -...</td>\n",
              "      <td>{'imdb_id': 'tt4007502', 'movie_id': 1614, 'po...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>tt4209788</td>\n",
              "      <td>[0.08350061625242233, 0.4322584867477417, -0.2...</td>\n",
              "      <td>{'imdb_id': 'tt4209788', 'movie_id': 7022, 'po...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>tt2948356</td>\n",
              "      <td>[-0.1614755392074585, 0.41389355063438416, -0....</td>\n",
              "      <td>{'imdb_id': 'tt2948356', 'movie_id': 3571, 'po...</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-e776b9fa-a05a-40a7-b828-495c9c5f06b5')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "\n",
              "\n",
              "\n",
              "    <div id=\"df-c1ff4cb7-9ae2-4a4a-9b53-b67cc449671d\">\n",
              "      <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-c1ff4cb7-9ae2-4a4a-9b53-b67cc449671d')\"\n",
              "              title=\"Suggest charts.\"\n",
              "              style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "      </button>\n",
              "    </div>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "    background-color: #E8F0FE;\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: #1967D2;\n",
              "    height: 32px;\n",
              "    padding: 0 0 0 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: #E2EBFA;\n",
              "    box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: #174EA6;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "    background-color: #3B4455;\n",
              "    fill: #D2E3FC;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart:hover {\n",
              "    background-color: #434B5C;\n",
              "    box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "    filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "    fill: #FFFFFF;\n",
              "  }\n",
              "</style>\n",
              "\n",
              "    <script>\n",
              "      async function quickchart(key) {\n",
              "        const containerElement = document.querySelector('#' + key);\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      }\n",
              "    </script>\n",
              "\n",
              "      <script>\n",
              "\n",
              "function displayQuickchartButton(domScope) {\n",
              "  let quickchartButtonEl =\n",
              "    domScope.querySelector('#df-c1ff4cb7-9ae2-4a4a-9b53-b67cc449671d button.colab-df-quickchart');\n",
              "  quickchartButtonEl.style.display =\n",
              "    google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "}\n",
              "\n",
              "        displayQuickchartButton(document);\n",
              "      </script>\n",
              "      <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-e776b9fa-a05a-40a7-b828-495c9c5f06b5 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-e776b9fa-a05a-40a7-b828-495c9c5f06b5');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n"
            ],
            "text/plain": [
              "          id                                             values  \\\n",
              "0  tt5027774  [-0.12388430535793304, 0.23021861910820007, -0...   \n",
              "1  tt5463162  [0.008479624055325985, 0.3665461540222168, -0....   \n",
              "2  tt4007502  [-0.0022702165879309177, 0.5886886715888977, -...   \n",
              "3  tt4209788  [0.08350061625242233, 0.4322584867477417, -0.2...   \n",
              "4  tt2948356  [-0.1614755392074585, 0.41389355063438416, -0....   \n",
              "\n",
              "                                            metadata  \n",
              "0  {'imdb_id': 'tt5027774', 'movie_id': 6705, 'po...  \n",
              "1  {'imdb_id': 'tt5463162', 'movie_id': 7966, 'po...  \n",
              "2  {'imdb_id': 'tt4007502', 'movie_id': 1614, 'po...  \n",
              "3  {'imdb_id': 'tt4209788', 'movie_id': 7022, 'po...  \n",
              "4  {'imdb_id': 'tt2948356', 'movie_id': 3571, 'po...  "
            ]
          },
          "execution_count": 6,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "dataset.drop(['sparse_values', 'metadata'], axis=1, inplace=True)\n",
        "dataset.rename(columns={'blob': 'metadata'}, inplace=True)\n",
        "\n",
        "dataset.head()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "LJa96KyT2LTi"
      },
      "source": [
        "Here is an example of the metadata value."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 7,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "OPN0JTfx2LTi",
        "outputId": "0665127c-f667-48b0-d150-ccaf6751d62e"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "{'imdb_id': 'tt5027774',\n",
            " 'movie_id': 6705,\n",
            " 'poster': 'https://m.media-amazon.com/images/M/MV5BMjI0ODcxNzM1N15BMl5BanBnXkFtZTgwMzIwMTEwNDI@._V1_SX300.jpg',\n",
            " 'rating': 4.0,\n",
            " 'title': 'Three Billboards Outside Ebbing, Missouri (2017)',\n",
            " 'user_id': 4556}\n"
          ]
        }
      ],
      "source": [
        "from pprint import pp\n",
        "\n",
        "pp(dataset['metadata'][0])"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "XU3yufR02LTi"
      },
      "source": [
        "Now we move on to initializing our Pinecone vector database."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Creating an Index\n",
        "\n",
        "Now the data is ready, we can set up our index to store it.\n",
        "\n",
        "We begin by initializing our connection to Pinecone. To do this we need a [free API key](https://app.pinecone.io)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import os\n",
        "\n",
        "if not os.environ.get(\"PINECONE_API_KEY\"):\n",
        "    from pinecone_notebooks.colab import Authenticate\n",
        "    Authenticate()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from pinecone import Pinecone\n",
        "\n",
        "api_key = os.environ.get(\"PINECONE_API_KEY\")\n",
        "\n",
        "# configure client\n",
        "pc = Pinecone(api_key=api_key)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Now we setup our index specification, this allows us to define the cloud provider and region where we want to deploy our index. You can find a list of all [available providers and regions here](https://docs.pinecone.io/docs/projects)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from pinecone import ServerlessSpec\n",
        "\n",
        "cloud = os.environ.get('PINECONE_CLOUD') or 'aws'\n",
        "region = os.environ.get('PINECONE_REGION') or 'us-east-1'\n",
        "\n",
        "spec = ServerlessSpec(cloud=cloud, region=region)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "RyjrDHL32LTj"
      },
      "source": [
        "In order to create a new index, we need to specify the index name, similarity metric, as well as the dimension of the vectors stored in that index.\n",
        "<br>\n",
        "We will assign these values here.\n",
        "<br>\n",
        "Note that the dimension parameter has to match the embedding dimensions provided in the dataset (or the model that outputs those embeddings)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "56Yt1eV72LTj",
        "outputId": "c4ef1867-dc97-4f7b-bb26-8d5cf50d2db9"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "32"
            ]
          },
          "execution_count": 11,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# embedding dimensions\n",
        "len(dataset['values'][0])"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 12,
      "metadata": {
        "id": "Rw-HJOBA2LTj",
        "tags": [
          "parameters"
        ]
      },
      "outputs": [],
      "source": [
        "index_name = 'onboarding-recommender'"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BLIgf3Wl2LTj"
      },
      "source": [
        "First, we need to check if the index already exists. In this example, we will delete it and create a new one."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 13,
      "metadata": {
        "id": "z78OHM0j2LTj"
      },
      "outputs": [],
      "source": [
        "if index_name in pc.list_indexes().names():\n",
        "    pc.delete_index(index_name)\n",
        "\n",
        "pc.create_index(\n",
        "        index_name,\n",
        "        dimension=32,  \n",
        "        metric='cosine',\n",
        "        spec=spec\n",
        "    )\n",
        "# wait a moment for the index to be fully initialized\n",
        "while not pc.describe_index(index_name).status['ready']:\n",
        "    time.sleep(1)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yuh_KmEI2LTj"
      },
      "source": [
        "We are going to initialize an index variable so that we can use it later on to describe the index and perform vector upsert."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 14,
      "metadata": {
        "id": "HSA6BAfl2LTj"
      },
      "outputs": [],
      "source": [
        "index = pc.Index(index_name)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VeBky_c75vEA"
      },
      "source": [
        "Initially the index will be empty:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 15,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "QG8unV5R2LTj",
        "outputId": "2beda05b-86b8-4f6e-a550-9f092f98c813"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{'dimension': 32,\n",
              " 'index_fullness': 0.0,\n",
              " 'namespaces': {'': {'vector_count': 0}},\n",
              " 'total_vector_count': 0}"
            ]
          },
          "execution_count": 15,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "index.describe_index_stats()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "dEp-wZjc5y9b"
      },
      "source": [
        "We upsert like so:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 17,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 99,
          "referenced_widgets": [
            "89cf37ba2570433fb09b391a3c440e57",
            "0815f9bdc73a47be8e841d26ac0febc1",
            "96a1d8e445be41ef976162bb1b3b2a20",
            "cb98b1d54e4b465885561fa1f6829ead",
            "1c5aab255eb6470a8e789e040063bc98",
            "3424a2fa338c4c00afbdcb36288b2772",
            "f6177e6539064b8ba6a006385de45366",
            "87685e96656f4b2cb56bed23fd264213",
            "cfca72a8877c4b7c8f3b8abd5f4dbb30",
            "e2930a83573948a1860149d0e87d0b10",
            "2b979e3a7fcf47b0914b2143c90bb112",
            "49cb34618589477b91d64200892dd04b",
            "bed866473b984701bb1e228b0d98ed44",
            "b69378488293410b99db580c21196135",
            "846e78ae161148ff930eb2d1d23e96cd",
            "a0104a08845f4f2196b0e1dc24cc8fbd",
            "f28627a7d1ca48d3b9e5ecde1f13d116",
            "abe30251e8be4c0aa8c3285b44a31270",
            "619f0310234f4a25b8664b90a2ef0bc5",
            "47de64ce9ba54cb280721891fcfc91f3",
            "6474a1a1a92148bf9bda91b8390e3806",
            "68cdf187417740a68ff2c68ce13af3f8"
          ]
        },
        "id": "2yr3as-82LTj",
        "outputId": "aafa92cf-ed12-4f29-e192-4da2762cba20"
      },
      "outputs": [
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "89cf37ba2570433fb09b391a3c440e57",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "sending upsert requests:   0%|          | 0/10000 [00:00<?, ?it/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "49cb34618589477b91d64200892dd04b",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "collecting async responses:   0%|          | 0/10 [00:00<?, ?it/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "text/plain": [
              "upserted_count: 10000"
            ]
          },
          "execution_count": 17,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "index.upsert_from_dataframe(dataset, batch_size=1000)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "zSaHJ1YE50gl"
      },
      "source": [
        "Now we should see 10K vectors in our index (note, the `total_vector_count` may take some time to fully update, paritcularly when using the Pinecone *Standard Tier*)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 21,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "iNmaRTdV2LTj",
        "outputId": "0c4209fe-24cc-4c91-8f92-c114482d08e4"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{'dimension': 32,\n",
              " 'index_fullness': 0.02066,\n",
              " 'namespaces': {'': {'vector_count': 2066}},\n",
              " 'total_vector_count': 2066}"
            ]
          },
          "execution_count": 21,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "index.describe_index_stats()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "O8ybkrRC2LTj"
      },
      "source": [
        "## Querying the Index"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "p2wPJwtj2LTj"
      },
      "source": [
        "Now, when the index is populated, we can perform queries on it to find the most relevant recommendations.\n",
        "<br>\n",
        "To do that, we need to instantiate our embedding models so that we can create vectors from our input user or input item objects."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-Ux4ZCIH2LTj"
      },
      "source": [
        "### Getting the Model\n",
        "\n",
        "We will download the models from the HuggingFace Hub. We will use one model to embed the *example user* and another model to embed the *example item*. <br>\n",
        "This will allow us to retrieve the most relevant items for a specific user or find the most similar items to a specific item."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 22,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 188,
          "referenced_widgets": [
            "7ab6935f6be04636bca56c1b18f48871",
            "6b9f37b8ae114c71bdd86fa591736f1a",
            "9f224dd22b134275803908cc3dc2da6d",
            "0b4193e93f954c578fef7b4c1b8e1c4e",
            "5c618580d7124f0fa1acdb03ed674acf",
            "e4fa89b2a5fe4eb98369e7bbec9462d0",
            "6c04949a90824faf8eee03fd5e721f62",
            "3ea3c0945c4549bab21e73c2c52dc3c0",
            "c35efb3b6e3b4fa5a84a0ec6aecf4290",
            "11bb72df0ba146958adaa1efddb90f71",
            "7bf9059dd14d4de5acf9e7636c4663df",
            "76a1782667ac467aa0cda9951c8fc6c7",
            "9bcd137b1c7e40c8a780e08a60d6fd0b",
            "5814caf95df24271a693caa383be4f54",
            "9147099e53d54b5eb1796ed062d819a1",
            "86305c15a8ff46d9bcac59e3ff5d068d",
            "0b78150578324fb5aa57c6c761404581",
            "6ee1c17b0155422fb7b33dc3a5421caa",
            "231d892286d8430c8dd6cad129d6e150",
            "fa65d5ef820947a2a09388eedda6d6dd",
            "ef55dfd53ed64e50a8ec69d6a75ae852",
            "a91a8079fa9a4b48840c42efd34401e6"
          ]
        },
        "id": "wiAeOiCt2LTj",
        "outputId": "d5f08d22-8163-44f0-bc3b-6058db7b2a66"
      },
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "config.json not found in HuggingFace Hub.\n",
            "WARNING:huggingface_hub.hub_mixin:config.json not found in HuggingFace Hub.\n"
          ]
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "7ab6935f6be04636bca56c1b18f48871",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "Fetching 7 files:   0%|          | 0/7 [00:00<?, ?it/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.\n",
            "config.json not found in HuggingFace Hub.\n",
            "WARNING:huggingface_hub.hub_mixin:config.json not found in HuggingFace Hub.\n"
          ]
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "76a1782667ac467aa0cda9951c8fc6c7",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "Fetching 7 files:   0%|          | 0/7 [00:00<?, ?it/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.\n"
          ]
        }
      ],
      "source": [
        "from huggingface_hub import from_pretrained_keras\n",
        "\n",
        "user_model = from_pretrained_keras(\"pinecone/movie-recommender-user-model\")\n",
        "movie_model = from_pretrained_keras(\"pinecone/movie-recommender-movie-model\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "m6iu4Ih02LTj"
      },
      "source": [
        "Before we proceed, we can create a `movie_details` dataset that we can use later on to print out the results."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 23,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 206
        },
        "id": "F6ok0Ioa2LTk",
        "outputId": "2a411766-5275-4b36-c589-5539ef1be136"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "\n",
              "\n",
              "  <div id=\"df-e4b8631f-fe5a-48cb-a32f-d0b38471e703\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>imdb_id</th>\n",
              "      <th>movie_id</th>\n",
              "      <th>poster</th>\n",
              "      <th>rating</th>\n",
              "      <th>title</th>\n",
              "      <th>user_id</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>tt5027774</td>\n",
              "      <td>6705</td>\n",
              "      <td>https://m.media-amazon.com/images/M/MV5BMjI0OD...</td>\n",
              "      <td>4.0</td>\n",
              "      <td>Three Billboards Outside Ebbing, Missouri (2017)</td>\n",
              "      <td>4556</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>tt5463162</td>\n",
              "      <td>7966</td>\n",
              "      <td>https://m.media-amazon.com/images/M/MV5BMDkzNm...</td>\n",
              "      <td>3.5</td>\n",
              "      <td>Deadpool 2 (2018)</td>\n",
              "      <td>20798</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>tt4007502</td>\n",
              "      <td>1614</td>\n",
              "      <td>https://m.media-amazon.com/images/M/MV5BMjY3YT...</td>\n",
              "      <td>4.5</td>\n",
              "      <td>Frozen Fever (2015)</td>\n",
              "      <td>26543</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>tt4209788</td>\n",
              "      <td>7022</td>\n",
              "      <td>https://m.media-amazon.com/images/M/MV5BNTkzMz...</td>\n",
              "      <td>4.0</td>\n",
              "      <td>Molly's Game (2017)</td>\n",
              "      <td>4106</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>tt2948356</td>\n",
              "      <td>3571</td>\n",
              "      <td>https://m.media-amazon.com/images/M/MV5BOTMyMj...</td>\n",
              "      <td>4.0</td>\n",
              "      <td>Zootopia (2016)</td>\n",
              "      <td>15259</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-e4b8631f-fe5a-48cb-a32f-d0b38471e703')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "\n",
              "\n",
              "\n",
              "    <div id=\"df-18e181be-34ec-42b7-962d-112d5b4cb5ad\">\n",
              "      <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-18e181be-34ec-42b7-962d-112d5b4cb5ad')\"\n",
              "              title=\"Suggest charts.\"\n",
              "              style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "      </button>\n",
              "    </div>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "    background-color: #E8F0FE;\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: #1967D2;\n",
              "    height: 32px;\n",
              "    padding: 0 0 0 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: #E2EBFA;\n",
              "    box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: #174EA6;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "    background-color: #3B4455;\n",
              "    fill: #D2E3FC;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart:hover {\n",
              "    background-color: #434B5C;\n",
              "    box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "    filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "    fill: #FFFFFF;\n",
              "  }\n",
              "</style>\n",
              "\n",
              "    <script>\n",
              "      async function quickchart(key) {\n",
              "        const containerElement = document.querySelector('#' + key);\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      }\n",
              "    </script>\n",
              "\n",
              "      <script>\n",
              "\n",
              "function displayQuickchartButton(domScope) {\n",
              "  let quickchartButtonEl =\n",
              "    domScope.querySelector('#df-18e181be-34ec-42b7-962d-112d5b4cb5ad button.colab-df-quickchart');\n",
              "  quickchartButtonEl.style.display =\n",
              "    google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "}\n",
              "\n",
              "        displayQuickchartButton(document);\n",
              "      </script>\n",
              "      <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-e4b8631f-fe5a-48cb-a32f-d0b38471e703 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-e4b8631f-fe5a-48cb-a32f-d0b38471e703');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n"
            ],
            "text/plain": [
              "     imdb_id  movie_id                                             poster  \\\n",
              "0  tt5027774      6705  https://m.media-amazon.com/images/M/MV5BMjI0OD...   \n",
              "1  tt5463162      7966  https://m.media-amazon.com/images/M/MV5BMDkzNm...   \n",
              "2  tt4007502      1614  https://m.media-amazon.com/images/M/MV5BMjY3YT...   \n",
              "3  tt4209788      7022  https://m.media-amazon.com/images/M/MV5BNTkzMz...   \n",
              "4  tt2948356      3571  https://m.media-amazon.com/images/M/MV5BOTMyMj...   \n",
              "\n",
              "   rating                                             title  user_id  \n",
              "0     4.0  Three Billboards Outside Ebbing, Missouri (2017)     4556  \n",
              "1     3.5                                 Deadpool 2 (2018)    20798  \n",
              "2     4.5                               Frozen Fever (2015)    26543  \n",
              "3     4.0                               Molly's Game (2017)     4106  \n",
              "4     4.0                                   Zootopia (2016)    15259  "
            ]
          },
          "execution_count": 23,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "import pandas as pd\n",
        "\n",
        "movies_details = pd.DataFrame(\n",
        "    dataset['metadata'].values.tolist()\n",
        ")\n",
        "movies_details.head()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qu1fHhaK2LTk"
      },
      "source": [
        "#### Item Similarity"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hX_Gq_HM2LTk"
      },
      "source": [
        "First, we can check how our vector database behaves when returning the most similar movies upon querying it using the movie vector created using the `movie_model` loaded above."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 24,
      "metadata": {
        "id": "vyfK8B422LTk"
      },
      "outputs": [],
      "source": [
        "movie_id = 1263  # you can try experimenting with different movie ids to obtain different results, for example 3571\n",
        "movie_vector = movie_model(movie_id).numpy().tolist()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 25,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 35
        },
        "id": "9WJrgG9A2LTk",
        "outputId": "21348c1a-7413-4f84-80a0-95c6ae21d5c6"
      },
      "outputs": [
        {
          "data": {
            "application/vnd.google.colaboratory.intrinsic+json": {
              "type": "string"
            },
            "text/plain": [
              "'Avengers: Infinity War - Part I (2018)'"
            ]
          },
          "execution_count": 25,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "movies_details[movies_details['movie_id'] == movie_id]['title'].tolist()[0]"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 27,
      "metadata": {
        "id": "DUztZQW52LTs"
      },
      "outputs": [],
      "source": [
        "movie_query_results = index.query(vector=\n",
        "    movie_vector,\n",
        "    top_k=10,\n",
        "    include_metadata=True\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 30,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 381
        },
        "id": "3D4smGAl2LTs",
        "outputId": "05258680-e92e-41ab-da5b-2e0280e11d38"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Recommendations: \n"
          ]
        },
        {
          "data": {
            "text/html": [
              "\n",
              "\n",
              "  <div id=\"df-2cd247bb-b182-49d8-b0dd-ccd301a7292a\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>movies</th>\n",
              "      <th>scores</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>Avengers: Infinity War - Part I (2018)</td>\n",
              "      <td>0.998588</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>Avengers: Infinity War - Part II (2019)</td>\n",
              "      <td>0.987156</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Thor: Ragnarok (2017)</td>\n",
              "      <td>0.981000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>Captain America: Civil War (2016)</td>\n",
              "      <td>0.979533</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>Guardians of the Galaxy (2014)</td>\n",
              "      <td>0.976064</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>Guardians of the Galaxy 2 (2017)</td>\n",
              "      <td>0.960372</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>Avengers: Age of Ultron (2015)</td>\n",
              "      <td>0.944494</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>7</th>\n",
              "      <td>Untitled Spider-Man Reboot (2017)</td>\n",
              "      <td>0.944431</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>8</th>\n",
              "      <td>Logan (2017)</td>\n",
              "      <td>0.936184</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>9</th>\n",
              "      <td>Spider-Man: Into the Spider-Verse (2018)</td>\n",
              "      <td>0.930237</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-2cd247bb-b182-49d8-b0dd-ccd301a7292a')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "\n",
              "\n",
              "\n",
              "    <div id=\"df-659f4e53-3c17-46a6-90f8-cc566cfe79db\">\n",
              "      <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-659f4e53-3c17-46a6-90f8-cc566cfe79db')\"\n",
              "              title=\"Suggest charts.\"\n",
              "              style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "      </button>\n",
              "    </div>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "    background-color: #E8F0FE;\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: #1967D2;\n",
              "    height: 32px;\n",
              "    padding: 0 0 0 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: #E2EBFA;\n",
              "    box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: #174EA6;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "    background-color: #3B4455;\n",
              "    fill: #D2E3FC;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart:hover {\n",
              "    background-color: #434B5C;\n",
              "    box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "    filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "    fill: #FFFFFF;\n",
              "  }\n",
              "</style>\n",
              "\n",
              "    <script>\n",
              "      async function quickchart(key) {\n",
              "        const containerElement = document.querySelector('#' + key);\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      }\n",
              "    </script>\n",
              "\n",
              "      <script>\n",
              "\n",
              "function displayQuickchartButton(domScope) {\n",
              "  let quickchartButtonEl =\n",
              "    domScope.querySelector('#df-659f4e53-3c17-46a6-90f8-cc566cfe79db button.colab-df-quickchart');\n",
              "  quickchartButtonEl.style.display =\n",
              "    google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "}\n",
              "\n",
              "        displayQuickchartButton(document);\n",
              "      </script>\n",
              "      <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-2cd247bb-b182-49d8-b0dd-ccd301a7292a button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-2cd247bb-b182-49d8-b0dd-ccd301a7292a');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n"
            ],
            "text/plain": [
              "                                     movies    scores\n",
              "0    Avengers: Infinity War - Part I (2018)  0.998588\n",
              "1   Avengers: Infinity War - Part II (2019)  0.987156\n",
              "2                     Thor: Ragnarok (2017)  0.981000\n",
              "3         Captain America: Civil War (2016)  0.979533\n",
              "4            Guardians of the Galaxy (2014)  0.976064\n",
              "5          Guardians of the Galaxy 2 (2017)  0.960372\n",
              "6            Avengers: Age of Ultron (2015)  0.944494\n",
              "7         Untitled Spider-Man Reboot (2017)  0.944431\n",
              "8                              Logan (2017)  0.936184\n",
              "9  Spider-Man: Into the Spider-Verse (2018)  0.930237"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ],
      "source": [
        "df = pd.DataFrame(\n",
        "    {\n",
        "        'movies': [record.metadata['title'] for record in movie_query_results.matches],\n",
        "        'scores': [record.score for record in movie_query_results.matches]\n",
        "    }\n",
        ")\n",
        "print(\"Recommendations: \")\n",
        "display(df)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "sXvf7mU22LTs"
      },
      "source": [
        "We can observe that it is doing an excellent job in finding similar movies, and it is accomplishing this task very quickly."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "m-cwktMp2LTs"
      },
      "source": [
        "#### User Recommendations"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "axrgCNTH2LTs"
      },
      "source": [
        "Now, let's observe how our vector database behaves when we query it using the user vector.\n",
        "<br>\n",
        "We expect to receive movies that closely resemble the ones that the user rated highly."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 91,
      "metadata": {
        "id": "M8IPoq1Z2LTs"
      },
      "outputs": [],
      "source": [
        "user_id = 825\n",
        "user_vector = user_model(user_id).numpy().tolist()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8sz0bmOz2LTs"
      },
      "source": [
        "Here, we are defining a function that allows us to easily display the movies that the user rated in the past."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 92,
      "metadata": {
        "id": "azkQAcNh2LTs"
      },
      "outputs": [],
      "source": [
        "def top_movies_user_rated(user):\n",
        "    # get list of movies that the user has rated\n",
        "    user_movies = movies_details[movies_details[\"user_id\"] == user]\n",
        "    # order by their top rated movies\n",
        "    top_rated = user_movies.sort_values(by=['rating'], ascending=False)\n",
        "    # return the top 14 movies\n",
        "    return pd.DataFrame(\n",
        "        {\n",
        "            'movies': top_rated['title'].tolist()[:14],\n",
        "            'ratings': top_rated['rating'].tolist()[:14]\n",
        "        }\n",
        "    )"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 93,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 332
        },
        "id": "IcvnRxMT2LTs",
        "outputId": "13a662b6-37af-4589-82a3-d061cdd230ca"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "\n",
              "\n",
              "  <div id=\"df-8b4fd6e2-2cf3-44b3-a96f-0d2e882e9411\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>movies</th>\n",
              "      <th>ratings</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>Swiss Army Man (2016)</td>\n",
              "      <td>4.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>Blackfish (2013)</td>\n",
              "      <td>3.5</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Raman Raghav 2.0 (2016)</td>\n",
              "      <td>3.5</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>Loveless (2017)</td>\n",
              "      <td>3.5</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>Ae Dil Hai Mushkil (2016)</td>\n",
              "      <td>3.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>Ted 2 (2015)</td>\n",
              "      <td>3.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>Inside Llewyn Davis (2013)</td>\n",
              "      <td>3.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>7</th>\n",
              "      <td>Drinking Buddies (2013)</td>\n",
              "      <td>2.5</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>8</th>\n",
              "      <td>Yamla Pagla Deewana 2 (2013)</td>\n",
              "      <td>1.5</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-8b4fd6e2-2cf3-44b3-a96f-0d2e882e9411')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "\n",
              "\n",
              "\n",
              "    <div id=\"df-b9de030a-e735-4624-8f77-a0bab3ef8bfc\">\n",
              "      <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-b9de030a-e735-4624-8f77-a0bab3ef8bfc')\"\n",
              "              title=\"Suggest charts.\"\n",
              "              style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "      </button>\n",
              "    </div>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "    background-color: #E8F0FE;\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: #1967D2;\n",
              "    height: 32px;\n",
              "    padding: 0 0 0 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: #E2EBFA;\n",
              "    box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: #174EA6;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "    background-color: #3B4455;\n",
              "    fill: #D2E3FC;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart:hover {\n",
              "    background-color: #434B5C;\n",
              "    box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "    filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "    fill: #FFFFFF;\n",
              "  }\n",
              "</style>\n",
              "\n",
              "    <script>\n",
              "      async function quickchart(key) {\n",
              "        const containerElement = document.querySelector('#' + key);\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      }\n",
              "    </script>\n",
              "\n",
              "      <script>\n",
              "\n",
              "function displayQuickchartButton(domScope) {\n",
              "  let quickchartButtonEl =\n",
              "    domScope.querySelector('#df-b9de030a-e735-4624-8f77-a0bab3ef8bfc button.colab-df-quickchart');\n",
              "  quickchartButtonEl.style.display =\n",
              "    google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "}\n",
              "\n",
              "        displayQuickchartButton(document);\n",
              "      </script>\n",
              "      <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-8b4fd6e2-2cf3-44b3-a96f-0d2e882e9411 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-8b4fd6e2-2cf3-44b3-a96f-0d2e882e9411');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n"
            ],
            "text/plain": [
              "                         movies  ratings\n",
              "0         Swiss Army Man (2016)      4.0\n",
              "1              Blackfish (2013)      3.5\n",
              "2       Raman Raghav 2.0 (2016)      3.5\n",
              "3               Loveless (2017)      3.5\n",
              "4     Ae Dil Hai Mushkil (2016)      3.0\n",
              "5                  Ted 2 (2015)      3.0\n",
              "6    Inside Llewyn Davis (2013)      3.0\n",
              "7       Drinking Buddies (2013)      2.5\n",
              "8  Yamla Pagla Deewana 2 (2013)      1.5"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ],
      "source": [
        "display(top_movies_user_rated(user_id))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "aFzlItA92LTs"
      },
      "source": [
        "And now we can pass our `user_vector` to the query to get the recommendations."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 94,
      "metadata": {
        "id": "4MctHWEU2LTs"
      },
      "outputs": [],
      "source": [
        "query_results = index.query(vector=\n",
        "    user_vector,\n",
        "    top_k=10,\n",
        "    include_metadata=True\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 95,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 381
        },
        "id": "1pvCntTP2LTs",
        "outputId": "0d0a2262-0645-44d2-8567-de3f379d1804"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Recommendations: \n"
          ]
        },
        {
          "data": {
            "text/html": [
              "\n",
              "\n",
              "  <div id=\"df-2a66f21c-b7fb-4b3c-b8ba-05d3e88dd258\">\n",
              "    <div class=\"colab-df-container\">\n",
              "      <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>movies</th>\n",
              "      <th>scores</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>12 Years a Slave (2013)</td>\n",
              "      <td>0.868902</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>Bridegroom (2013)</td>\n",
              "      <td>0.850303</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Capital C (2015)</td>\n",
              "      <td>0.850294</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>Dangal (2016)</td>\n",
              "      <td>0.843546</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>The Crew (2016)</td>\n",
              "      <td>0.841317</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>Capernaum (2018)</td>\n",
              "      <td>0.840019</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>Shaadi Mein Zaroor Aana (2017)</td>\n",
              "      <td>0.832370</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>7</th>\n",
              "      <td>Tanu Weds Manu Returns (2015)</td>\n",
              "      <td>0.829788</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>8</th>\n",
              "      <td>Spider-Man: Far from Home (2019)</td>\n",
              "      <td>0.826584</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>9</th>\n",
              "      <td>Avengers: Infinity War - Part II (2019)</td>\n",
              "      <td>0.824928</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "      <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-2a66f21c-b7fb-4b3c-b8ba-05d3e88dd258')\"\n",
              "              title=\"Convert this dataframe to an interactive table.\"\n",
              "              style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M0 0h24v24H0V0z\" fill=\"none\"/>\n",
              "    <path d=\"M18.56 5.44l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94zm-11 1L8.5 8.5l.94-2.06 2.06-.94-2.06-.94L8.5 2.5l-.94 2.06-2.06.94zm10 10l.94 2.06.94-2.06 2.06-.94-2.06-.94-.94-2.06-.94 2.06-2.06.94z\"/><path d=\"M17.41 7.96l-1.37-1.37c-.4-.4-.92-.59-1.43-.59-.52 0-1.04.2-1.43.59L10.3 9.45l-7.72 7.72c-.78.78-.78 2.05 0 2.83L4 21.41c.39.39.9.59 1.41.59.51 0 1.02-.2 1.41-.59l7.78-7.78 2.81-2.81c.8-.78.8-2.07 0-2.86zM5.41 20L4 18.59l7.72-7.72 1.47 1.35L5.41 20z\"/>\n",
              "  </svg>\n",
              "      </button>\n",
              "\n",
              "\n",
              "\n",
              "    <div id=\"df-0b4c902d-ac84-4b20-97dd-4ac8f94f17a6\">\n",
              "      <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-0b4c902d-ac84-4b20-97dd-4ac8f94f17a6')\"\n",
              "              title=\"Suggest charts.\"\n",
              "              style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "      </button>\n",
              "    </div>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "    background-color: #E8F0FE;\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: #1967D2;\n",
              "    height: 32px;\n",
              "    padding: 0 0 0 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: #E2EBFA;\n",
              "    box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: #174EA6;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "    background-color: #3B4455;\n",
              "    fill: #D2E3FC;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart:hover {\n",
              "    background-color: #434B5C;\n",
              "    box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "    filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "    fill: #FFFFFF;\n",
              "  }\n",
              "</style>\n",
              "\n",
              "    <script>\n",
              "      async function quickchart(key) {\n",
              "        const containerElement = document.querySelector('#' + key);\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      }\n",
              "    </script>\n",
              "\n",
              "      <script>\n",
              "\n",
              "function displayQuickchartButton(domScope) {\n",
              "  let quickchartButtonEl =\n",
              "    domScope.querySelector('#df-0b4c902d-ac84-4b20-97dd-4ac8f94f17a6 button.colab-df-quickchart');\n",
              "  quickchartButtonEl.style.display =\n",
              "    google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "}\n",
              "\n",
              "        displayQuickchartButton(document);\n",
              "      </script>\n",
              "      <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      flex-wrap:wrap;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "      <script>\n",
              "        const buttonEl =\n",
              "          document.querySelector('#df-2a66f21c-b7fb-4b3c-b8ba-05d3e88dd258 button.colab-df-convert');\n",
              "        buttonEl.style.display =\n",
              "          google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "        async function convertToInteractive(key) {\n",
              "          const element = document.querySelector('#df-2a66f21c-b7fb-4b3c-b8ba-05d3e88dd258');\n",
              "          const dataTable =\n",
              "            await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                     [key], {});\n",
              "          if (!dataTable) return;\n",
              "\n",
              "          const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "            '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "            + ' to learn more about interactive tables.';\n",
              "          element.innerHTML = '';\n",
              "          dataTable['output_type'] = 'display_data';\n",
              "          await google.colab.output.renderOutput(dataTable, element);\n",
              "          const docLink = document.createElement('div');\n",
              "          docLink.innerHTML = docLinkHtml;\n",
              "          element.appendChild(docLink);\n",
              "        }\n",
              "      </script>\n",
              "    </div>\n",
              "  </div>\n"
            ],
            "text/plain": [
              "                                    movies    scores\n",
              "0                  12 Years a Slave (2013)  0.868902\n",
              "1                        Bridegroom (2013)  0.850303\n",
              "2                         Capital C (2015)  0.850294\n",
              "3                            Dangal (2016)  0.843546\n",
              "4                          The Crew (2016)  0.841317\n",
              "5                         Capernaum (2018)  0.840019\n",
              "6           Shaadi Mein Zaroor Aana (2017)  0.832370\n",
              "7            Tanu Weds Manu Returns (2015)  0.829788\n",
              "8         Spider-Man: Far from Home (2019)  0.826584\n",
              "9  Avengers: Infinity War - Part II (2019)  0.824928"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ],
      "source": [
        "df = pd.DataFrame(\n",
        "    {\n",
        "        'movies': [record.metadata['title'] for record in query_results.matches],\n",
        "        'scores': [record.score for record in query_results.matches]\n",
        "    }\n",
        ")\n",
        "print(\"Recommendations: \")\n",
        "display(df)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "91FgZa1g2LTt"
      },
      "source": [
        "Using this method, we can identify movies similar to those the user rated highly, while avoiding those the user rated lowly."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3j6Zgk8k2LTt"
      },
      "source": [
        "## Summary\n",
        "\n",
        "The notebook demonstrated the step-by-step process of creating and populating an index in the vector database. It covered aspects such as specifying the index name, similarity metric, and vector dimensions. The example also included instructions on checking if an index exists, deleting and creating new indexes when necessary.\n",
        "\n",
        "Furthermore, the notebook illustrated the usage of embedding models to generate vector representations of both users and items. The results showed that the recommendations closely resembled the movies that the user rated highly, and dissimilar movies were not included.\n",
        "\n",
        "Overall, this example showcased the power and efficiency of vector databases in recommendation systems. It is important to note that the benefits of vector databases extend beyond movies and can be applied to various types of items, making them a valuable tool in building effective recommendation systems."
      ]
    }
  ],
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "display_name": "pinecone-onboarding",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.8.16"
    },
    "orig_nbformat": 4
  },
  "nbformat": 4,
  "nbformat_minor": 0
}